Abstract

Let $(X_n)_{n \ge 0}$ be an irreducible, aperiodic, and homogeneous binary
Markov chain and let $LI_n$ be the length of the longest (weakly) increasing
subsequence of $(X_k)_{1 \le k \le n}$. Using combinatorial constructions and
weak invariance principles, we present elementary arguments leading
to a new proof that (after proper centering and scaling) the limiting
law of $LI_n$ is the maximal eigenvalue of a $2 \times 2$ Gaussian random
matrix. In fact, the limiting shape of the RSK Young diagrams associated
with the binary Markov random word is the spectrum of this
random matrix.

1 Introduction

The identification of the limiting distribution of the length of the longest
increasing subsequence of a random permutation or of a random word has
attracted a lot of interest in the past decade, in particular in light of its
connections with random matrices (see [BEIIS], [6], [8], [12], [13], [14], [15],
[17], [18]). For random words, both the iid uniform and non-uniform settings
are understood, leading respectively to the maximal eigenvalue of a traceless
(or generalized traceless) element of the Gaussian Unitary Ensemble (GUE)
as limiting laws of $LI_n$. In a dependent framework, Kuperberg [16]
conjectured that if the word is generated by an irreducible, doubly-stochastic,
cyclic, Markov chain with state space an ordered $m$-letter alphabet, then the
limiting distribution of the length $LI_n$ is still that of the maximal eigenvalue
of a traceless $m \times m$ element of the GUE. More generally, the conjecture
asserts that the shape of the Robinson-Schensted-Knuth (RSK) Young diagrams
associated with the Markovian random word is that of the joint distribution
of the eigenvalues of a traceless $m \times m$ element of the GUE. For
$m = 2$, Chistyakov and Götze [7] positively answered this conjecture, and in
the present paper this result is rederived in an elementary way.

The precise class of homogeneous Markov chains with which Kuperberg's
conjecture is concerned is more specific than the ones we shall study. The
irreducibility of the chain is a basic property we certainly must demand: each
letter has to occur at some point following the occurrence of any given letter.
The cyclic (also called circulant) criterion, i.e., the requirement that the
Markov transition matrix $P$ have entries satisfying $p_{i,j} = p_{i+1,j+1}$, for
$1 \le i, j \le m$ (where $m + 1 = 1$), ensures a uniform stationary distribution.

Let us also note that Kuperberg implicitly assumes the Markov chain to
be aperiodic. Indeed, the simple 2-state Markov chain for the letters
$\alpha_1$ and $\alpha_2$ described by $\mathbb{P}(X_{n+1} = \alpha_i \mid X_n = \alpha_j) = 1$ for $i \ne j$ produces a
sequence of alternating letters, so that $LI_n$ is always either $n/2$ or $n/2 + 1$, for
$n$ even, and $(n+1)/2$, for $n$ odd, and so has a degenerate limiting distribution.
Even though this Markov chain is irreducible and cyclic, it is periodic.
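
The values quoted for the alternating chain can be checked directly. The following is a minimal sketch, not part of the original argument: the letters $\alpha_1, \alpha_2$ are coded as the integers 1 and 2, and the helper names `longest_weakly_increasing` and `alternating_word` are illustrative choices. The length is computed with the standard patience-sorting technique.

```python
from bisect import bisect_right


def longest_weakly_increasing(seq):
    """Length of the longest weakly increasing subsequence (patience sorting)."""
    tails = []  # tails[j]: smallest possible last value of such a subsequence of length j + 1
    for x in seq:
        j = bisect_right(tails, x)  # bisect_right lets equal values extend a subsequence
        if j == len(tails):
            tails.append(x)
        else:
            tails[j] = x
    return len(tails)


def alternating_word(n, first):
    """Word of length n over {1, 2} that alternates, starting from `first`."""
    other = 3 - first
    return [first if i % 2 == 0 else other for i in range(n)]


for n in (6, 7):
    lengths = [longest_weakly_increasing(alternating_word(n, s)) for s in (1, 2)]
    print(n, lengths)
```

For even $n$ the two starting letters give $n/2 + 1$ and $n/2$, while for odd $n$ both give $(n+1)/2$, in agreement with the discussion above.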

By the end of this introduction, the reader might well wonder
how the binary results are modified for ordered alphabets of arbitrary fixed
size $m$. As shown in [10], for $m = 3$, Kuperberg's conjecture is indeed true.
However, for $m \ge 4$, this is no longer the case; some, but not all, cyclic
Markov chains lead to a limiting law as in the iid uniform case.

2 Combinatorics



As in [9], one can express $LI_n$ in a combinatorial manner. For convenience,
this short section recapitulates that development.

Let $(X_n)_{n \ge 1}$ consist of a sequence of values taken from an $m$-letter ordered
alphabet, $\mathcal{A}_m = \{\alpha_1 < \alpha_2 < \cdots < \alpha_m\}$. Let $a_k^r$ be the number of
occurrences of $\alpha_r$ among $(X_j)_{1 \le j \le k}$. Each increasing subsequence of $(X_j)_{1 \le j \le k}$
consists simply of consecutive identical values, with these values forming
an increasing subsequence of $(\alpha_r)$. Moreover, the number of occurrences of
$\alpha_r \in \mathcal{A}_m$ among $(X_j)_{k+1 \le j \le \ell}$, where $1 \le k < \ell \le n$, is simply
$a_\ell^r - a_k^r$. The length of the longest increasing subsequence of $X_1, X_2, \ldots, X_n$
is thus given by

$$LI_n = \max_{0 \le k_1 \le \cdots \le k_{m-1} \le n} \left[ (a^1_{k_1} - a^1_0) + (a^2_{k_2} - a^2_{k_1}) + \cdots + (a^m_n - a^m_{k_{m-1}}) \right], \tag{2.1}$$

i.e.,

$$LI_n = \max_{0 \le k_1 \le \cdots \le k_{m-1} \le n} \left[ (a^1_{k_1} - a^2_{k_1}) + (a^2_{k_2} - a^3_{k_2}) + \cdots + (a^{m-1}_{k_{m-1}} - a^m_{k_{m-1}}) + a^m_n \right], \tag{2.2}$$

where $a^r_0 = 0$.
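
The split-point representation (2.1) can be compared with a direct computation on small words. The sketch below is only a brute-force illustration under stated assumptions: letters are coded as the integers $1, \ldots, m$, and `lis_length` and `lis_by_split_points` are hypothetical helper names. It enumerates all non-decreasing $(k_1, \ldots, k_{m-1})$, so it is only practical for short words.

```python
import random
from bisect import bisect_right
from itertools import combinations_with_replacement


def lis_length(word):
    """Longest weakly increasing subsequence via patience sorting."""
    tails = []
    for x in word:
        j = bisect_right(tails, x)
        if j == len(tails):
            tails.append(x)
        else:
            tails[j] = x
    return len(tails)


def lis_by_split_points(word, m):
    """Evaluate the maximum in (2.1) over 0 <= k_1 <= ... <= k_{m-1} <= n."""
    n = len(word)
    # counts[r][k]: number of occurrences of letter r + 1 among the first k letters
    counts = [[0] * (n + 1) for _ in range(m)]
    for k, x in enumerate(word, start=1):
        for r in range(m):
            counts[r][k] = counts[r][k - 1] + (1 if x == r + 1 else 0)
    best = 0
    for ks in combinations_with_replacement(range(n + 1), m - 1):
        cuts = (0,) + ks + (n,)  # k_0 = 0 and k_m = n
        total = sum(counts[r][cuts[r + 1]] - counts[r][cuts[r]] for r in range(m))
        best = max(best, total)
    return best


rng = random.Random(0)
word = [rng.randint(1, 3) for _ in range(8)]
print(word, lis_length(word), lis_by_split_points(word, 3))
```

The two printed lengths agree, since letter $\alpha_r$ is only counted on the block of indices between $k_{r-1}$ and $k_r$.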

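For $i = 1, \ldots, n$ and $r = 1, \ldots, m - 1$, let

$$Z_i^r = \begin{cases} 1, & \text{if } X_i = \alpha_r, \\ -1, & \text{if } X_i = \alpha_{r+1}, \\ 0, & \text{otherwise}, \end{cases} \tag{2.3}$$

and let $S_k^r = \sum_{i=1}^k Z_i^r$, $k = 1, \ldots, n$, with also $S_0^r = 0$. Then clearly $S_k^r = a_k^r - a_k^{r+1}$. Hence,

$$LI_n = \max_{0 \le k_1 \le \cdots \le k_{m-1} \le n} \left\{ S^1_{k_1} + S^2_{k_2} + \cdots + S^{m-1}_{k_{m-1}} + a_n^m \right\}. \tag{2.4}$$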

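By the telescoping nature of the sum $\sum_{k=r}^{m-1} S_n^k = \sum_{k=r}^{m-1} (a_n^k - a_n^{k+1})$, we
find that, for each $1 \le r \le m - 1$, $a_n^r = a_n^m + \sum_{k=r}^{m-1} S_n^k$. Since $a_k^1, \ldots, a_k^m$
must evidently sum up to $k$, we have

$$n = \sum_{r=1}^{m} a_n^r = \sum_{r=1}^{m-1} \left( a_n^m + \sum_{k=r}^{m-1} S_n^k \right) + a_n^m = \sum_{r=1}^{m-1} r S_n^r + m a_n^m.$$

Solving for $a_n^m$ gives us

$$a_n^m = \frac{n}{m} - \frac{1}{m} \sum_{r=1}^{m-1} r S_n^r.$$

Substituting into (2.4), we finally obtain

$$LI_n = \frac{n}{m} - \frac{1}{m} \sum_{r=1}^{m-1} r S_n^r + \max_{0 \le k_1 \le \cdots \le k_{m-1} \le n} \left\{ S^1_{k_1} + S^2_{k_2} + \cdots + S^{m-1}_{k_{m-1}} \right\}. \tag{2.5}$$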

As emphasized in [9], (2.5) is of a purely combinatorial nature or, in more
probabilistic terms, is of a pathwise nature. We now proceed to analyze (2.5)
for a binary Markovian sequence.
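
Since (2.5) is a pathwise identity, it can be tested word by word. The following sketch makes several assumptions that are not in the text: letters are coded as $1, \ldots, m$, the function names are hypothetical, and the constrained maximum over $k_1 \le \cdots \le k_{m-1}$ is evaluated by a simple dynamic programme (one possible device, chosen here for brevity).

```python
import random
from bisect import bisect_right


def lis_length(word):
    """Longest weakly increasing subsequence via patience sorting."""
    tails = []
    for x in word:
        j = bisect_right(tails, x)
        if j == len(tails):
            tails.append(x)
        else:
            tails[j] = x
    return len(tails)


def lis_by_formula_2_5(word, m):
    """Right-hand side of (2.5), computed from the walks S^1, ..., S^{m-1}."""
    n = len(word)
    # S[r][k] stores S_k^{r+1}: (# of letter r+1) - (# of letter r+2) among the first k letters
    S = [[0] * (n + 1) for _ in range(m - 1)]
    for k, x in enumerate(word, start=1):
        for r in range(m - 1):
            z = 1 if x == r + 1 else (-1 if x == r + 2 else 0)
            S[r][k] = S[r][k - 1] + z
    # best[k]: max of S^1_{k_1} + ... + S^r_{k_r} over k_1 <= ... <= k_r <= k
    best = [0] * (n + 1)
    for r in range(m - 1):
        new = [0] * (n + 1)
        new[0] = best[0] + S[r][0]
        for k in range(1, n + 1):
            new[k] = max(new[k - 1], best[k] + S[r][k])
        best = new
    weighted = sum((r + 1) * S[r][n] for r in range(m - 1))
    # n - weighted equals m * a_n^m, so the integer division is exact
    return (n - weighted) // m + best[n]


rng = random.Random(2)
word = [rng.randint(1, 4) for _ in range(30)]
print(lis_length(word), lis_by_formula_2_5(word, 4))
```

Both numbers coincide for every word, random or not, which is what "pathwise" means here.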



3 Binary Markovian Alphabet 

In the context of binary Markovian alphabets, $(X_n)_{n \ge 0}$ is described by the
following transition probabilities between the two states (which we identify
with the two letters $\alpha_1$ and $\alpha_2$): $\mathbb{P}(X_{n+1} = \alpha_2 \mid X_n = \alpha_1) = a$ and
$\mathbb{P}(X_{n+1} = \alpha_1 \mid X_n = \alpha_2) = b$, where $0 < a + b < 2$. We later examine the degenerate
cases $a = b = 0$ and $a = b = 1$. In keeping with the common usage within the
Markov chain literature, we begin our sequence at $n = 0$, although our focus
will be on $n \ge 1$. Denoting by $(p_n^1, p_n^2)$ the vector describing the probability
distribution on $\{\alpha_1, \alpha_2\}$ at time $n$, we have

$$(p_{n+1}^1, p_{n+1}^2) = (p_n^1, p_n^2) \begin{pmatrix} 1 - a & a \\ b & 1 - b \end{pmatrix}. \tag{3.1}$$

The eigenvalues of the matrix in (3.1) are $\lambda_1 = 1$ and $-1 < \lambda_2 = 1 - a - b < 1$,
with respective left eigenvectors $(\pi_1, \pi_2) = (b/(a + b), a/(a + b))$
and $(1, -1)$. Moreover, $(\pi_1, \pi_2)$ is also the stationary distribution. Given any
initial distribution $(p_0^1, p_0^2)$, we find that

$$(p_n^1, p_n^2) = (\pi_1, \pi_2) + \lambda_2^n \frac{a p_0^1 - b p_0^2}{a + b} (1, -1) \to (\pi_1, \pi_2), \tag{3.2}$$

as $n \to \infty$, since $|\lambda_2| < 1$.
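
The convergence to the stationary distribution can be illustrated numerically. The sketch below assumes the transition matrix with rows $(1-a, a)$ and $(b, 1-b)$ implied by the transition probabilities above, and compares direct iteration with the closed form obtained by decomposing the initial distribution along the two left eigenvectors $(\pi_1, \pi_2)$ and $(1, -1)$. The function names are illustrative.

```python
def distribution_by_iteration(p0, a, b, n):
    """Apply the one-step update n times to the row vector p0 = (p^1_0, p^2_0)."""
    p1, p2 = p0
    for _ in range(n):
        p1, p2 = p1 * (1 - a) + p2 * b, p1 * a + p2 * (1 - b)
    return p1, p2


def distribution_closed_form(p0, a, b, n):
    """(pi_1, pi_2) + lambda_2^n * (a p^1_0 - b p^2_0) / (a + b) * (1, -1)."""
    pi1, pi2 = b / (a + b), a / (a + b)
    lam2 = 1 - a - b
    shift = lam2 ** n * (a * p0[0] - b * p0[1]) / (a + b)
    return pi1 + shift, pi2 - shift


p0, a, b = (0.9, 0.1), 0.3, 0.6
for n in (0, 1, 5, 20):
    print(n, distribution_by_iteration(p0, a, b, n), distribution_closed_form(p0, a, b, n))
```

Because $|\lambda_2| < 1$, the correction term decays geometrically, and both computations approach $(b/(a+b), a/(a+b))$.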

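Our goal is now to use these probabilistic expressions to describe the
random variables $Z_k^1$ and $S_k^1$ defined in the previous section. (We retain the
redundant superscript "1" in $Z_k^1$ and $S_k^1$ in the interest of uniformity.)

Setting $\beta = a p_0^1 - b p_0^2$, we easily find that

$$\mathbb{E} Z_k^1 = \frac{b - a}{a + b} + 2 \frac{\beta}{a + b} \lambda_2^k, \tag{3.3}$$

for each $1 \le k \le n$. Thus,

$$\mathbb{E} S_k^1 = \frac{b - a}{a + b} k + 2 \left( \frac{\beta}{a + b} \right) \left( \frac{\lambda_2 - \lambda_2^{k+1}}{1 - \lambda_2} \right), \tag{3.4}$$

and so $\mathbb{E} S_k^1 / k \to (b - a)/(a + b)$, as $k \to \infty$.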

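Turning to the second moments of $Z_k^1$ and $S_k^1$, first note that $\mathbb{E}(Z_k^1)^2 = 1$,
since $(Z_k^1)^2 = 1$ a.s. Next, we consider $\mathbb{E} Z_k^1 Z_\ell^1$, for $k < \ell$. Using the
Markovian structure of $(X_n)_{n \ge 0}$, it quickly follows that

$$\mathbb{P}((X_k, X_\ell) = (x_k, x_\ell)) = \begin{cases} \left( \pi_1 + \lambda_2^{\ell - k} \frac{a}{a + b} \right) \left( \pi_1 + \lambda_2^k \frac{\beta}{a + b} \right), & \text{if } (x_k, x_\ell) = (\alpha_1, \alpha_1), \\ \left( \pi_2 - \lambda_2^{\ell - k} \frac{a}{a + b} \right) \left( \pi_1 + \lambda_2^k \frac{\beta}{a + b} \right), & \text{if } (x_k, x_\ell) = (\alpha_1, \alpha_2), \\ \left( \pi_1 - \lambda_2^{\ell - k} \frac{b}{a + b} \right) \left( \pi_2 - \lambda_2^k \frac{\beta}{a + b} \right), & \text{if } (x_k, x_\ell) = (\alpha_2, \alpha_1), \\ \left( \pi_2 + \lambda_2^{\ell - k} \frac{b}{a + b} \right) \left( \pi_2 - \lambda_2^k \frac{\beta}{a + b} \right), & \text{if } (x_k, x_\ell) = (\alpha_2, \alpha_2). \end{cases} \tag{3.5}$$

In each case, the first factor is the $(\ell - k)$-step transition probability from
$x_k$ to $x_\ell$, and the second factor is $\mathbb{P}(X_k = x_k)$.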
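For simplicity, we will henceforth assume that our initial distribution is
the stationary one, i.e., $(p_0^1, p_0^2) = (\pi_1, \pi_2)$ (this assumption can be dropped,
as explained in the Concluding Remarks of [10]). Under this assumption,
$\beta = 0$, $\mathbb{E} S_k^1 = k\mu$, where $\mu = \mathbb{E} Z_k^1 = (b - a)/(a + b)$, and (3.5) simplifies to

$$\mathbb{P}((X_k, X_\ell) = (x_k, x_\ell)) = \begin{cases} \left( \pi_1 + \lambda_2^{\ell - k} \frac{a}{a + b} \right) \pi_1, & \text{if } (x_k, x_\ell) = (\alpha_1, \alpha_1), \\ \left( \pi_2 - \lambda_2^{\ell - k} \frac{a}{a + b} \right) \pi_1, & \text{if } (x_k, x_\ell) = (\alpha_1, \alpha_2), \\ \left( \pi_1 - \lambda_2^{\ell - k} \frac{b}{a + b} \right) \pi_2, & \text{if } (x_k, x_\ell) = (\alpha_2, \alpha_1), \\ \left( \pi_2 + \lambda_2^{\ell - k} \frac{b}{a + b} \right) \pi_2, & \text{if } (x_k, x_\ell) = (\alpha_2, \alpha_2). \end{cases} \tag{3.6}$$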



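We can now compute $\mathbb{E} Z_k^1 Z_\ell^1$:

$$\begin{aligned} \mathbb{E} Z_k^1 Z_\ell^1 &= \mathbb{P}(Z_k^1 Z_\ell^1 = +1) - \mathbb{P}(Z_k^1 Z_\ell^1 = -1) \\ &= \mathbb{P}((X_k, X_\ell) \in \{(\alpha_1, \alpha_1), (\alpha_2, \alpha_2)\}) - \mathbb{P}((X_k, X_\ell) \in \{(\alpha_1, \alpha_2), (\alpha_2, \alpha_1)\}) \\ &= \left( \pi_1^2 + \lambda_2^{\ell - k} \frac{a}{a + b} \pi_1 + \pi_2^2 + \lambda_2^{\ell - k} \frac{b}{a + b} \pi_2 \right) - \left( \pi_1 \pi_2 - \lambda_2^{\ell - k} \frac{a}{a + b} \pi_1 + \pi_1 \pi_2 - \lambda_2^{\ell - k} \frac{b}{a + b} \pi_2 \right) \\ &= \left( \pi_1^2 + \pi_2^2 + \frac{2ab}{(a + b)^2} \lambda_2^{\ell - k} \right) - \left( 2 \pi_1 \pi_2 - \frac{2ab}{(a + b)^2} \lambda_2^{\ell - k} \right) \\ &= \frac{(b - a)^2}{(a + b)^2} + \frac{4ab}{(a + b)^2} \lambda_2^{\ell - k}. \end{aligned} \tag{3.7}$$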
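
The closed form for $\mathbb{E} Z_k^1 Z_\ell^1$ under the stationary start depends only on the gap $\ell - k$, and can be cross-checked against an exact computation with powers of the transition matrix. The sketch below assumes the transition matrix with rows $(1-a, a)$ and $(b, 1-b)$ and uses illustrative helper names.

```python
def mat_mul(A, B):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def mat_pow(P, n):
    """n-th power of a 2 x 2 matrix by repeated multiplication."""
    R = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        R = mat_mul(R, P)
    return R


def second_moment_exact(a, b, gap):
    """E[Z_k Z_l] with l - k = gap, for the chain started from its stationary law."""
    pi = [b / (a + b), a / (a + b)]
    Pg = mat_pow([[1 - a, a], [b, 1 - b]], gap)
    z = [1.0, -1.0]  # Z = +1 on alpha_1 and -1 on alpha_2
    return sum(pi[i] * Pg[i][j] * z[i] * z[j] for i in range(2) for j in range(2))


def second_moment_formula(a, b, gap):
    """(b - a)^2 / (a + b)^2 + 4ab / (a + b)^2 * lambda_2^gap."""
    lam2 = 1 - a - b
    return ((b - a) ** 2 + 4 * a * b * lam2 ** gap) / (a + b) ** 2


for gap in (1, 2, 5):
    print(gap, second_moment_exact(0.3, 0.6, gap), second_moment_formula(0.3, 0.6, gap))
```

Subtracting $\mu^2 = (b-a)^2/(a+b)^2$ leaves a covariance that decays geometrically at rate $\lambda_2$.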



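Hence, recalling that $\beta = 0$,

$$\sigma^2 := \operatorname{Var} Z_k^1 = 1 - \left( \frac{b - a}{a + b} \right)^2 = \frac{4ab}{(a + b)^2}, \tag{3.8}$$

for all $k \ge 1$, and, for $k < \ell$, the covariance of $Z_k^1$ and $Z_\ell^1$ is

$$\operatorname{Cov}(Z_k^1, Z_\ell^1) = \frac{(b - a)^2}{(a + b)^2} + \sigma^2 \lambda_2^{\ell - k} - \mu^2 = \sigma^2 \lambda_2^{\ell - k}. \tag{3.9}$$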
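Proceeding to the covariance structure of $S_k^1$, we first find that

$$\begin{aligned} \operatorname{Var} S_k^1 &= \sum_{j=1}^{k} \operatorname{Var} Z_j^1 + 2 \sum_{1 \le j < \ell \le k} \operatorname{Cov}(Z_j^1, Z_\ell^1) \\ &= \sigma^2 k + 2 \sigma^2 \sum_{1 \le j < \ell \le k} \lambda_2^{\ell - j} \\ &= \sigma^2 k + 2 \sigma^2 \left( \frac{\lambda_2^{k+1} - k \lambda_2^2 + (k - 1) \lambda_2}{(1 - \lambda_2)^2} \right) \\ &= \sigma^2 \left( \frac{1 + \lambda_2}{1 - \lambda_2} k + \frac{2 \lambda_2 (\lambda_2^k - 1)}{(1 - \lambda_2)^2} \right). \end{aligned} \tag{3.10}$$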

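Next, for $k < \ell$, and using (3.9) and (3.10), the covariance of $S_k^1$ and $S_\ell^1$
is given by

$$\begin{aligned} \operatorname{Cov}(S_k^1, S_\ell^1) &= \sum_{i=1}^{k} \sum_{j=1}^{\ell} \operatorname{Cov}(Z_i^1, Z_j^1) \\ &= \sum_{i=1}^{k} \operatorname{Var} Z_i^1 + 2 \sum_{1 \le i < j \le k} \operatorname{Cov}(Z_i^1, Z_j^1) + \sum_{i=1}^{k} \sum_{j=k+1}^{\ell} \operatorname{Cov}(Z_i^1, Z_j^1) \\ &= \operatorname{Var} S_k^1 + \sum_{i=1}^{k} \sum_{j=k+1}^{\ell} \operatorname{Cov}(Z_i^1, Z_j^1) \\ &= \operatorname{Var} S_k^1 + \sigma^2 \left( \frac{\lambda_2 (1 - \lambda_2^k)(1 - \lambda_2^{\ell - k})}{(1 - \lambda_2)^2} \right) \\ &= \sigma^2 \left( \frac{1 + \lambda_2}{1 - \lambda_2} k - \frac{\lambda_2 (1 - \lambda_2^k)(1 + \lambda_2^{\ell - k})}{(1 - \lambda_2)^2} \right). \end{aligned} \tag{3.11}$$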
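
As a sanity check on the algebra, the closed forms (3.10) and (3.11) can be compared with the defining double sums, using $\operatorname{Cov}(Z_i^1, Z_j^1) = \sigma^2 \lambda_2^{|i - j|}$ with $\sigma^2 = 4ab/(a+b)^2$ under the stationary start. This is a minimal sketch with hypothetical function names; `cov_s_closed` evaluates $\sigma^2 \left( \frac{1 + \lambda_2}{1 - \lambda_2} k - \frac{\lambda_2 (1 - \lambda_2^k)(1 + \lambda_2^{\ell - k})}{(1 - \lambda_2)^2} \right)$, which reduces to the variance formula when $k = \ell$.

```python
def cov_z(a, b, i, j):
    """Cov(Z_i, Z_j) = sigma^2 * lambda_2^|i - j| for the stationary chain."""
    sigma2 = 4 * a * b / (a + b) ** 2
    lam2 = 1 - a - b
    return sigma2 * lam2 ** abs(i - j)


def cov_s_direct(a, b, k, l):
    """Cov(S_k, S_l) as the double sum of the covariances of the increments."""
    return sum(cov_z(a, b, i, j) for i in range(1, k + 1) for j in range(1, l + 1))


def cov_s_closed(a, b, k, l):
    """Closed form for k <= l; equals Var S_k when k == l."""
    sigma2 = 4 * a * b / (a + b) ** 2
    lam2 = 1 - a - b
    linear = (1 + lam2) / (1 - lam2) * k
    correction = lam2 * (1 - lam2 ** k) * (1 + lam2 ** (l - k)) / (1 - lam2) ** 2
    return sigma2 * (linear - correction)


for k, l in [(3, 3), (3, 7), (10, 25)]:
    print(k, l, cov_s_direct(0.3, 0.6, k, l), cov_s_closed(0.3, 0.6, k, l))
```

The correction term stays bounded in $k$ and $\ell$, which is why only the linear term survives after dividing by $k$.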



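From (3.10) and (3.11) we see that, as $k \to \infty$,

$$\frac{\operatorname{Var} S_k^1}{k} \to \sigma^2 \left( \frac{1 + \lambda_2}{1 - \lambda_2} \right), \tag{3.12}$$

and, moreover, as $k \wedge \ell \to \infty$,

$$\frac{\operatorname{Cov}(S_k^1, S_\ell^1)}{k \wedge \ell} \to \sigma^2 \left( \frac{1 + \lambda_2}{1 - \lambda_2} \right). \tag{3.13}$$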

When $a = b$, $\mathbb{E} S_k^1 = 0$, and in (3.12) the asymptotic variance becomes

$$\frac{\operatorname{Var} S_k^1}{k} \to \frac{4a^2}{(2a)^2} \left( \frac{1 + (1 - 2a)}{1 - (1 - 2a)} \right) = \frac{1 - a}{a}.$$

For $a$ small, we have a "lazy" Markov chain, that is, a Markov chain which
tends to remain in a given state for long periods of time. In this regime, the
random variable $S_k^1$ has long periods of increase followed by long periods of
decrease. In this way, linear asymptotics of the variance with large constants
occur. If, on the other hand, $a$ is close to 1, the Markov chain rapidly shifts
back and forth between $\alpha_1$ and $\alpha_2$, and so the constant associated with the
linearly increasing variance of $S_k^1$ is small.

As in [9], Brownian functionals play a central role in describing the
limiting distribution of $LI_n$. To move towards a Brownian functional expression
for the limiting law of $LI_n$, define the polygonal function

$$\hat{B}_n(t) = \frac{S^1_{[nt]} - [nt] \mu}{\sigma \sqrt{n (1 + \lambda_2)/(1 - \lambda_2)}} + \frac{(nt - [nt]) (Z^1_{[nt]+1} - \mu)}{\sigma \sqrt{n (1 + \lambda_2)/(1 - \lambda_2)}}, \tag{3.14}$$

for $0 \le t \le 1$. In our finite-state, irreducible, aperiodic, stationary Markov
chain setting, we may conclude that $\hat{B}_n \Rightarrow B$, where $B$ is a standard Brownian
motion, as desired. (See, for example,
the more general settings for Gordin's martingale approach to dependent
invariance principles, and the stationary ergodic invariance principle found
in Theorem 19.1 of [5].)

Turning now to $LI_n$, we see that for the present 2-letter situation, (2.5)
simply becomes

$$LI_n = \frac{n}{2} - \frac{1}{2} S_n^1 + \max_{0 \le k \le n} S_k^1.$$

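To find the limiting distribution of $LI_n$ from this expression, recall that
$\pi_1 = b/(a + b)$, $\pi_2 = a/(a + b)$, $\mu = \pi_1 - \pi_2 = (b - a)/(a + b)$,
$\sigma^2 = 4ab/(a + b)^2$, and that $\lambda_2 = 1 - a - b$. Define $\pi_{max} = \max\{\pi_1, \pi_2\}$ and
$\tilde{\sigma}^2 = \sigma^2 (1 + \lambda_2)/(1 - \lambda_2)$. Rewriting (3.14) as

$$\hat{B}_n(t) = \frac{S^1_{[nt]} - [nt] \mu}{\tilde{\sigma} \sqrt{n}} + \frac{(nt - [nt]) (Z^1_{[nt]+1} - \mu)}{\tilde{\sigma} \sqrt{n}},$$

$LI_n$ becomes

$$\begin{aligned} LI_n &= \frac{n}{2} - \frac{1}{2} \left( \tilde{\sigma} \sqrt{n} \hat{B}_n(1) + \mu n \right) + \max_{0 \le t \le 1} \left\{ \tilde{\sigma} \sqrt{n} \hat{B}_n(t) + \mu n t \right\} \\ &= n \pi_2 - \frac{1}{2} \left( \tilde{\sigma} \sqrt{n} \hat{B}_n(1) \right) + \max_{0 \le t \le 1} \left\{ \tilde{\sigma} \sqrt{n} \hat{B}_n(t) + (\pi_1 - \pi_2) n t \right\} \\ &= n \pi_{max} - \frac{1}{2} \left( \tilde{\sigma} \sqrt{n} \hat{B}_n(1) \right) + \max_{0 \le t \le 1} \left\{ \tilde{\sigma} \sqrt{n} \hat{B}_n(t) + (\pi_1 - \pi_2) n t - (\pi_{max} - \pi_2) n \right\}. \end{aligned} \tag{3.15}$$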

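This immediately gives

$$\frac{LI_n - n \pi_{max}}{\tilde{\sigma} \sqrt{n}} = -\frac{1}{2} \hat{B}_n(1) + \max_{0 \le t \le 1} \left( \hat{B}_n(t) + \frac{\sqrt{n}}{\tilde{\sigma}} \left( (\pi_1 - \pi_2) t - (\pi_{max} - \pi_2) \right) \right). \tag{3.16}$$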

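Let us examine (3.16) on a case-by-case basis. First, if $\pi_{max} = \pi_1 = \pi_2 = 1/2$,
i.e., if $a = b$, then $\sigma = 1$ and $\tilde{\sigma}^2 = (1 - a)/a$, and so (3.16) becomes

$$\frac{LI_n - n/2}{\sqrt{(1 - a) n / a}} = -\frac{1}{2} \hat{B}_n(1) + \max_{0 \le t \le 1} \hat{B}_n(t). \tag{3.17}$$

Then, by the Invariance Principle and the Continuous Mapping Theorem,

$$\frac{LI_n - n/2}{\sqrt{(1 - a) n / a}} \Rightarrow -\frac{1}{2} B(1) + \max_{0 \le t \le 1} B(t). \tag{3.18}$$

Next, if $\pi_{max} = \pi_2 > \pi_1$, (3.16) becomes

$$\frac{LI_n - n \pi_{max}}{\tilde{\sigma} \sqrt{n}} = -\frac{1}{2} \hat{B}_n(1) + \max_{0 \le t \le 1} \left( \hat{B}_n(t) - \frac{\sqrt{n}}{\tilde{\sigma}} (\pi_{max} - \pi_1) t \right). \tag{3.19}$$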



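On the other hand, if $\pi_{max} = \pi_1 > \pi_2$, (3.16) becomes

$$\begin{aligned} \frac{LI_n - n \pi_{max}}{\tilde{\sigma} \sqrt{n}} &= -\frac{1}{2} \hat{B}_n(1) + \max_{0 \le t \le 1} \left( \hat{B}_n(t) - \frac{\sqrt{n}}{\tilde{\sigma}} (\pi_{max} - \pi_2)(1 - t) \right) \\ &= \frac{1}{2} \hat{B}_n(1) + \max_{0 \le t \le 1} \left( \hat{B}_n(t) - \hat{B}_n(1) - \frac{\sqrt{n}}{\tilde{\sigma}} (\pi_{max} - \pi_2)(1 - t) \right). \end{aligned} \tag{3.20}$$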

In both (3.19) and (3.20) we have a term in our maximal functional which
is linear in $t$ or $1 - t$, with a negative slope. We now show, in an elementary
fashion, that in both cases, as $n \to \infty$, the maximal functional goes to zero
in probability.

Consider first (3.19). Let $c_n = \sqrt{n} (\pi_{max} - \pi_1)/\tilde{\sigma} > 0$, and for any $c > 0$,
let $M_c = \max_{0 \le t \le 1} (B(t) - ct)$, where $B(t)$ is a standard Brownian motion.
Now for $n$ large enough, $c_n \ge c$, so that

$$\hat{B}_n(t) - ct \ge \hat{B}_n(t) - c_n t$$

a.s., for all $0 \le t \le 1$. Then for any $z > 0$, and $n$ large enough,

$$\mathbb{P}\left( \max_{0 \le t \le 1} (\hat{B}_n(t) - c_n t) > z \right) \le \mathbb{P}\left( \max_{0 \le t \le 1} (\hat{B}_n(t) - ct) > z \right), \tag{3.21}$$

and so by the Invariance Principle and the Continuous Mapping Theorem,

$$\limsup_{n \to \infty} \mathbb{P}\left( \max_{0 \le t \le 1} (\hat{B}_n(t) - c_n t) > z \right) \le \lim_{n \to \infty} \mathbb{P}\left( \max_{0 \le t \le 1} (\hat{B}_n(t) - ct) > z \right) = \mathbb{P}(M_c > z). \tag{3.22}$$
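Now, as is well-known, $\mathbb{P}(M_c > z) \to 0$ as $c \to \infty$. One can confirm this
intuitive fact with the following simple argument. For $z > 0$, $c > 0$, and
$0 < \varepsilon < 1$, we have that

$$\begin{aligned} \mathbb{P}(M_c > z) &\le \mathbb{P}\left( \max_{0 \le t \le \varepsilon} (B(t) - ct) > z \right) + \mathbb{P}\left( \max_{\varepsilon < t \le 1} (B(t) - ct) > z \right) \\ &\le \mathbb{P}\left( \max_{0 \le t \le \varepsilon} B(t) > z \right) + \mathbb{P}\left( \max_{\varepsilon < t \le 1} (B(t) - c\varepsilon) > z \right) \\ &\le \mathbb{P}\left( \max_{0 \le t \le \varepsilon} B(t) > z \right) + \mathbb{P}\left( \max_{0 \le t \le 1} B(t) > c\varepsilon + z \right) \\ &= 2 \left( 1 - \Phi\left( \frac{z}{\sqrt{\varepsilon}} \right) \right) + 2 \left( 1 - \Phi(c\varepsilon + z) \right). \end{aligned} \tag{3.23}$$

But, as $c$ and $\varepsilon$ are arbitrary, we can first take the limsup of (3.23) as $c \to \infty$,
and then let $\varepsilon \to 0$, proving the claim.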
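
To get a feel for how the bound (3.23) behaves, one can evaluate its right-hand side, $2(1 - \Phi(z/\sqrt{\varepsilon})) + 2(1 - \Phi(c\varepsilon + z))$, numerically. The sketch below is illustrative only: `std_normal_cdf` and `tail_bound` are hypothetical names, and $\Phi$ is computed from the error function in the standard library.

```python
import math


def std_normal_cdf(x):
    """Standard normal distribution function Phi."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def tail_bound(z, c, eps):
    """Upper bound on P(M_c > z): 2(1 - Phi(z / sqrt(eps))) + 2(1 - Phi(c * eps + z))."""
    first = 2 * (1 - std_normal_cdf(z / math.sqrt(eps)))
    second = 2 * (1 - std_normal_cdf(c * eps + z))
    return first + second


z = 0.5
for eps in (0.1, 0.01):
    for c in (10.0, 100.0, 1000.0):
        print(eps, c, tail_bound(z, c, eps))
```

For fixed $\varepsilon$ the second term vanishes as $c$ grows, and the remaining first term can then be made as small as desired by shrinking $\varepsilon$, which mirrors the order of limits in the argument.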
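We have thus shown that

$$\limsup_{n \to \infty} \mathbb{P}\left( \max_{0 \le t \le 1} (\hat{B}_n(t) - c_n t) > z \right) \le 0,$$

and since the functional clearly is equal to zero when $t = 0$, we have

$$\max_{0 \le t \le 1} (\hat{B}_n(t) - c_n t) \xrightarrow{\mathbb{P}} 0, \tag{3.24}$$

as $n \to \infty$. Thus, by the Continuous Mapping Theorem, and the Converging
Together Lemma, we obtain the weak convergence result

$$\frac{LI_n - n \pi_{max}}{\tilde{\sigma} \sqrt{n}} \Rightarrow -\frac{1}{2} B(1). \tag{3.25}$$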



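Lastly, consider (3.20). Here we need simply note the following equality
in law, which follows from the stationary and Markovian nature of the
underlying sequence $(X_n)_{n \ge 0}$:

$$\hat{B}_n(t) - \hat{B}_n(1) - \frac{\sqrt{n}}{\tilde{\sigma}} (\pi_{max} - \pi_2)(1 - t) \overset{\mathcal{L}}{=} -\hat{B}_n(1 - t) - \frac{\sqrt{n}}{\tilde{\sigma}} (\pi_{max} - \pi_2)(1 - t), \tag{3.26}$$

for $t = 0, 1/n, \ldots, (n - 1)/n, 1$. With a change of variables ($u = 1 - t$), and
noting that $B(t)$ and $-B(t)$ are equal in law, our previous convergence result
(3.24), now applied with $c_n = \sqrt{n} (\pi_{max} - \pi_2)/\tilde{\sigma}$, implies that

$$\max_{0 \le t \le 1} (\hat{B}_n(t) - \hat{B}_n(1) - c_n (1 - t)) \overset{\mathcal{L}}{=} \max_{0 \le u \le 1} (-\hat{B}_n(u) - c_n u) \xrightarrow{\mathbb{P}} 0, \tag{3.27}$$

as $n \to \infty$. Our limiting functional is thus of the form

$$\frac{LI_n - n \pi_{max}}{\tilde{\sigma} \sqrt{n}} \Rightarrow \frac{1}{2} B(1). \tag{3.28}$$

Since $B(1)$ is simply a standard normal random variable, the different signs
in (3.25) and (3.28) are inconsequential.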

Finally, consider the degenerate cases. If either $a = 0$ or $b = 0$, then the
sequence $(X_n)_{n \ge 0}$ will be a.s. eventually constant, regardless of the starting state, and
so $LI_n \sim n$. On the other hand, if $a = b = 1$, then the sequence oscillates
back and forth between $\alpha_1$ and $\alpha_2$, so that $LI_n \sim n/2$. Combining these
trivial cases with the previous development gives:

Theorem 3.1 Let $(X_n)_{n \ge 0}$ be a 2-state Markov chain, with $\mathbb{P}(X_{n+1} = \alpha_2 \mid X_n = \alpha_1) = a$
and $\mathbb{P}(X_{n+1} = \alpha_1 \mid X_n = \alpha_2) = b$. Let the law of $X_0$ be the
invariant distribution $(\pi_1, \pi_2) = (b/(a + b), a/(a + b))$, for $0 < a + b \le 2$, and
be $(\pi_1, \pi_2) = (1, 0)$, for $a = b = 0$. Then, for $a = b > 0$,

$$\frac{LI_n - n/2}{\sqrt{n}} \Rightarrow \sqrt{\frac{1 - a}{a}} \left( -\frac{1}{2} B(1) + \max_{0 \le t \le 1} B(t) \right), \tag{3.29}$$

where $(B(t))_{0 \le t \le 1}$ is a standard Brownian motion. For $a \ne b$ or $a = b = 0$,

$$\frac{LI_n - n \pi_{max}}{\sqrt{n}} \Rightarrow N(0, \tilde{\sigma}^2/4), \tag{3.30}$$

with $\pi_{max} = \max\{\pi_1, \pi_2\}$, and where $N(0, \tilde{\sigma}^2/4)$ is a centered normal random
variable with variance $\tilde{\sigma}^2/4 = ab(2 - a - b)/(a + b)^3$, for $a \ne b$, and $\tilde{\sigma}^2 = 0$,
for $a = b = 0$. (If $a = b = 1$, or $\tilde{\sigma}^2 = 0$, then the distributions in (3.29) and
(3.30), respectively, are understood to be degenerate at the origin.)
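
The Gaussian regime of Theorem 3.1 can be explored with a small Monte Carlo experiment. The following is a rough sketch and not part of the proof: letters are coded as 1 and 2, the chain is started from its stationary law, the function names, the seed, and the sample sizes are arbitrary choices, and the empirical variance of $(LI_n - n\pi_{max})/\sqrt{n}$ is only expected to be close to $ab(2 - a - b)/(a + b)^3$ for large $n$.

```python
import random
from bisect import bisect_right


def lis_length(word):
    """Longest weakly increasing subsequence via patience sorting."""
    tails = []
    for x in word:
        j = bisect_right(tails, x)
        if j == len(tails):
            tails.append(x)
        else:
            tails[j] = x
    return len(tails)


def sample_word(n, a, b, rng):
    """X_1, ..., X_n for the stationary chain with P(1 -> 2) = a and P(2 -> 1) = b."""
    x = 1 if rng.random() < b / (a + b) else 2  # X_0 drawn from (pi_1, pi_2)
    word = []
    for _ in range(n):
        u = rng.random()
        if x == 1:
            x = 2 if u < a else 1
        else:
            x = 1 if u < b else 2
        word.append(x)
    return word


def limiting_variance(a, b):
    """Variance ab(2 - a - b) / (a + b)^3 of the Gaussian limit for a != b."""
    return a * b * (2 - a - b) / (a + b) ** 3


def empirical_variance(n, a, b, trials, seed=0):
    """Sample variance of (LI_n - n * pi_max) / sqrt(n) over independent words."""
    rng = random.Random(seed)
    pi_max = max(a, b) / (a + b)
    vals = [(lis_length(sample_word(n, a, b, rng)) - n * pi_max) / n ** 0.5
            for _ in range(trials)]
    mean = sum(vals) / trials
    return sum((v - mean) ** 2 for v in vals) / (trials - 1)


print(empirical_variance(1000, 0.3, 0.6, 200), limiting_variance(0.3, 0.6))
```

With a few hundred trials the two printed numbers should agree to within ordinary sampling error; the agreement improves as both `n` and `trials` grow.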

To extend this result to the entire RSK Young diagrams, let us introduce
the following notation. By

$$(Y_n^{(1)}, Y_n^{(2)}, \ldots, Y_n^{(k)}) \Rightarrow (Y_\infty^{(1)}, Y_\infty^{(2)}, \ldots, Y_\infty^{(k)}), \tag{3.31}$$

we shall indicate the weak convergence of the joint law of the $k$-vector
$(Y_n^{(1)}, \ldots, Y_n^{(k)})$ to that of $(Y_\infty^{(1)}, \ldots, Y_\infty^{(k)})$, as $n \to \infty$. Since $LI_n$
is the length of the top row of the associated Young diagrams, the length of
the second row is simply $n - LI_n$. Denoting the length of the $i$th row by $R_n^i$,
(3.31), together with an application of the Cramér-Wold Theorem, recovers
the result of Chistyakov and Götze [7] as part of the following easy corollary,
which is in fact equivalent to Theorem 3.1:



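Corollary 3.1 For the sequence in Theorem 3.1, if $a = b > 0$, then

$$\left( \frac{R_n^1 - n/2}{\sqrt{n}}, \frac{R_n^2 - n/2}{\sqrt{n}} \right) \Rightarrow Y_\infty := (R_\infty^1, R_\infty^2), \tag{3.32}$$

where the law of $Y_\infty$ is supported on the second main diagonal of $\mathbb{R}^2$, and with

$$R_\infty^1 = \sqrt{\frac{1 - a}{a}} \left( -\frac{1}{2} B(1) + \max_{0 \le t \le 1} B(t) \right).$$

If $a \ne b$ or $a = b = 0$, then setting $\pi_{min} = \min\{\pi_1, \pi_2\}$, we have

$$\left( \frac{R_n^1 - n \pi_{max}}{\sqrt{n}}, \frac{R_n^2 - n \pi_{min}}{\sqrt{n}} \right) \Rightarrow N((0, 0), \Sigma), \tag{3.33}$$

where $\Sigma$ is the covariance matrix

$$\Sigma = \frac{\tilde{\sigma}^2}{4} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix},$$

where $\tilde{\sigma}^2 = 4ab(2 - a - b)/(a + b)^3$, for $a \ne b$, and $\tilde{\sigma}^2 = 0$, for $a = b = 0$.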

Remark 3.1 The joint distributions in (3.32) and (3.33) are of course
degenerate, in that the sum of the two components is a.s. identically zero in
each case. In (3.32), the density of the first component of $Y_\infty$ is easy to find,
and is given by (e.g., see [11])

$$f(y) = \frac{16}{\sqrt{2\pi}} \left( \frac{a}{1 - a} \right)^{3/2} y^2 e^{-2a y^2/(1 - a)}, \quad y \ge 0. \tag{3.34}$$
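
A quick numerical check of the density in (3.34) is possible. The sketch below assumes the form $f(y) = \frac{16}{\sqrt{2\pi}} \left( \frac{a}{1-a} \right)^{3/2} y^2 e^{-2ay^2/(1-a)}$, which is the density of $\sqrt{(1-a)/a}$ times one half of a chi random variable with three degrees of freedom; under that assumption the total mass is 1 and the second moment is $\frac{3}{4} \cdot \frac{1-a}{a}$. The names `density` and `integrate` are illustrative, and the integration is a plain midpoint rule.

```python
import math


def density(y, a):
    """Assumed form of (3.34), for 0 < a < 1 and y >= 0."""
    s2 = (1 - a) / a  # squared scale factor (1 - a) / a
    return 16 / math.sqrt(2 * math.pi) * s2 ** -1.5 * y * y * math.exp(-2 * y * y / s2)


def integrate(f, lo, hi, steps=20000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


a = 0.25
mass = integrate(lambda y: density(y, a), 0.0, 20.0)
second_moment = integrate(lambda y: y * y * density(y, a), 0.0, 20.0)
print(mass, second_moment, 0.75 * (1 - a) / a)
```

The second moment matches $\mathbb{E}\left( -\frac{1}{2} B(1) + \max_{0 \le t \le 1} B(t) \right)^2 = 3/4$ after rescaling by $(1 - a)/a$.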
As in Chistyakov and Götze [7], (3.32) can then be stated as: For any
bounded, continuous function $g : \mathbb{R}^2 \to \mathbb{R}$,

$$\lim_{n \to \infty} \mathbb{E}\, g\left( \frac{R_n^1 - n/2}{\sqrt{(1 - a) n / a}}, \frac{R_n^2 - n/2}{\sqrt{(1 - a) n / a}} \right) = 2 \sqrt{2\pi} \int_0^\infty g(x, -x)\, \phi_{GUE,2}(x, -x)\, dx,$$

where $\phi_{GUE,2}$ is the density of the eigenvalues of the $2 \times 2$ GUE, and is given
by

$$\phi_{GUE,2}(x_1, x_2) = \frac{1}{\pi} (x_1 - x_2)^2 e^{-(x_1^2 + x_2^2)}.$$

To see the GUE connection more explicitly, consider the $2 \times 2$ traceless
GUE matrix

$$M_0 = \begin{pmatrix} X_1 & Y + iZ \\ Y - iZ & X_2 \end{pmatrix},$$

where $X_1$, $X_2$, $Y$, and $Z$ are centered, normal random variables. Since
$\operatorname{Corr}(X_1, X_2) = -1$, the largest eigenvalue of $M_0$ is

$$\lambda_{1,0} = \sqrt{X_1^2 + Y^2 + Z^2},$$

almost surely, so that $\lambda_{1,0}^2 \sim \chi_3^2$ if $\operatorname{Var} X_1 = \operatorname{Var} Y = \operatorname{Var} Z = 1$. Hence, up
to a scaling factor, the density of $\lambda_{1,0}$ is given by (3.34). Next, let us perturb
$M_0$ to

$$M = \alpha G I + \beta M_0,$$

where $\alpha$ and $\beta$ are constants, $G$ is a standard normal random variable
independent of $M_0$, and $I$ is the identity matrix. The covariance of the diagonal
elements of $M$ is then computed to be $\rho := \alpha^2 - \beta^2$. Hence, to obtain a desired
value of $\rho$, we may take $\alpha = \sqrt{(1 + \rho)/2}$ and $\beta = \sqrt{(1 - \rho)/2}$. Clearly, the
largest eigenvalue of $M$ can then be expressed as

$$\lambda_1 = \sqrt{\frac{1 + \rho}{2}}\, G + \sqrt{\frac{1 - \rho}{2}}\, \lambda_{1,0}. \tag{3.35}$$

At one extreme, $\rho = -1$, we recover $\lambda_1 = \lambda_{1,0}$. At the other extreme, $\rho = 1$,
we obtain $\lambda_1 = G$. Midway between these two extremes, at $\rho = 0$, we have a
standard GUE matrix, so that

$$\lambda_1 = \sqrt{\frac{1}{2}} (G + \lambda_{1,0}).$$
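
The representation of the largest eigenvalue of $M = \alpha G I + \beta M_0$ can be verified entry by entry. The sketch below is an illustration under stated assumptions: $M_0$ is taken traceless with $X_2 = -X_1$, the function names are hypothetical, and the largest eigenvalue of a $2 \times 2$ Hermitian matrix is computed from the usual half-trace plus radius formula.

```python
import math
import random


def largest_eigenvalue_2x2(m11, m22, m12):
    """Largest eigenvalue of the Hermitian matrix [[m11, m12], [conj(m12), m22]]."""
    half_trace = (m11 + m22) / 2
    radius = math.sqrt(((m11 - m22) / 2) ** 2 + abs(m12) ** 2)
    return half_trace + radius


def perturbed_matrix(rho, g, x1, y, z):
    """Entries (M_11, M_22, M_12) of M = alpha*G*I + beta*M_0, with X_2 = -X_1."""
    alpha = math.sqrt((1 + rho) / 2)
    beta = math.sqrt((1 - rho) / 2)
    return alpha * g + beta * x1, alpha * g - beta * x1, beta * complex(y, z)


def eigenvalue_by_formula(rho, g, x1, y, z):
    """sqrt((1 + rho)/2) * G + sqrt((1 - rho)/2) * sqrt(X_1^2 + Y^2 + Z^2)."""
    lam10 = math.sqrt(x1 * x1 + y * y + z * z)
    return math.sqrt((1 + rho) / 2) * g + math.sqrt((1 - rho) / 2) * lam10


rng = random.Random(0)
for rho in (-1.0, 0.0, 0.5, 1.0):
    g, x1, y, z = (rng.gauss(0.0, 1.0) for _ in range(4))
    direct = largest_eigenvalue_2x2(*perturbed_matrix(rho, g, x1, y, z))
    print(rho, direct, eigenvalue_by_formula(rho, g, x1, y, z))
```

The agreement is exact up to rounding, because adding $\alpha G I$ shifts both eigenvalues of $\beta M_0$ by $\alpha G$, and the eigenvalues of the traceless matrix $M_0$ are $\pm \lambda_{1,0}$.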